Abstract: We provide a method for systematic encoding of the Multiplicity codes introduced by Kopparty, Saraf and Yekhanin in 2011. The construction is built on an idea of Kopparty. We properly define information sets for these codes and give detailed proofs of the validity of Kopparty's construction, which use generating functions. We also give a complexity estimate of the associated encoding algorithm.

Index Terms: Locally decodable codes, locally correctable codes, Reed-Muller codes, Multiplicity codes, information set.

I. Introduction 

Locally decodable codes (LDC) allow one to probabilistically retrieve one symbol of a message by looking at only a small fraction of its encoding. They were formally introduced by Katz and Trevisan in 2000 [1]. When the local decoding algorithm retrieves a symbol of the codeword instead of a message symbol, one speaks of locally correctable codes (LCC). For an extensive treatment of locally decodable and correctable codes, we refer the reader to [2].

For $C$ to be an LCC code, it is only required to have $C$ defined as $C \subseteq \mathbb{F}^n$, while the notion of an LDC code requires that $C$ is provided with an encoding $\mathrm{Enc} : \mathbb{F}^k \to \mathbb{F}^n$.

Considering codes which are $\mathbb{F}_q$-linear subspaces of $\mathbb{F}_q^n$, there is a reduction making an LDC code from an LCC code [2, Lemma 2.3]. This reduction heavily relies on the notion of information set.

A breakthrough of Kopparty, Saraf and Yekhanin [3] is a construction of high-rate LCCs with sublinear locality. These codes were termed Multiplicity codes, and generalize the Reed-Muller codes, using derivatives.

A technical and practical issue remains, which is to make these codes LDCs. For these codes, the message space and the codeword space do not share the same alphabet, so the standard reduction from [2, Lemma 2.3] cannot be applied. The problem was circumvented in [3] by using concatenation.

It is well known that LDCs can be used to build Private Information Retrieval (PIR) schemes, using a standard equivalence between LDCs and PIRs [1]. In [4], for the very particular case of Reed-Muller codes and Multiplicity codes, a better usage of these locally decodable codes in PIR schemes was introduced, using a partitioning of the $m$-dimensional affine space into few affine hyperplanes. The concatenation solution provided by [3] does not appear helpful in this context, since it more or less breaks the underlying affine geometry.

In the Appendix of [5], Kopparty described an idea to make a systematic encoding for Multiplicity codes. We clarify the idea of [5], providing notation and proofs, and solve a uniqueness problem, which is necessary to have a valid systematic encoding.

II. Problem statement 

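Let $q = p^t$ for some $t \in \mathbb{N}^*$ and $p$ prime. We enumerate the field with $q$ elements as $\mathbb{F}_q = \{\alpha_0, \alpha_1, \dots, \alpha_{q-1}\}$. Considering, for $m \in \mathbb{N}^*$, $m$ indeterminates $X_1, \dots, X_m$ and $m$ positive integers $i_1, \dots, i_m$, we use the short-hand notation

\[ \mathbf{X} = (X_1, \dots, X_m), \quad \mathbf{X}^{\mathbf{i}} = X_1^{i_1} \cdots X_m^{i_m}, \quad \mathbb{F}_q[\mathbf{X}] = \mathbb{F}_q[X_1, \dots, X_m], \]

\[ \mathbf{i} = (i_1, \dots, i_m) \in \mathbb{N}^m, \quad |\mathbf{i}| = i_1 + \cdots + i_m, \quad \mathbf{P} = (p_1, \dots, p_m) \in \mathbb{F}_q^m, \]

i.e. we use bold symbols for vectors, points, etc., and standard symbols for one-dimensional scalars, variables, etc. We denote by

\[ \mathbb{F}_q[\mathbf{X}]_d = \{ F \in \mathbb{F}_q[\mathbf{X}];\ \deg F \le d \}. \]

We also let $V = \mathbb{F}_q^m = \{\mathbf{P}_1, \dots, \mathbf{P}_n\}$, where $n = q^m$.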

A. Reed-Muller codes over $\mathbb{F}_q$ and information sets
We define the following evaluation map

\[ \mathrm{ev} : \mathbb{F}_q[\mathbf{X}] \to \mathbb{F}_q^n, \qquad F \mapsto (F(\mathbf{P}_1), \dots, F(\mathbf{P}_n)). \]

For an integer $d \ge 0$, we denote by $\mathbb{F}_q[\mathbf{X}]_d$ the set of polynomials of degree less than or equal to $d$, which has dimension $\binom{m+d}{m}$. We can now recall the definition of Reed-Muller codes over $\mathbb{F}_q$, also called Generalized Reed-Muller codes [6]:

Definition 1 (Reed-Muller codes over $\mathbb{F}_q$): For $d \le m(q-1)$, the $d$-th order Reed-Muller code over $\mathbb{F}_q$, $\mathrm{RM}_d$, is

\[ \mathrm{RM}_d = \{ \mathrm{ev}(F) \mid F \in \mathbb{F}_q[\mathbf{X}]_d \}. \]

From now on, we omit “over $\mathbb{F}_q$” and simply say Reed-Muller codes. The evaluation map ev maps $\binom{m+d}{m}$ symbols into $n$ symbols. However, when $d \ge q$, the map ev is not injective, and the dimension of $\mathrm{RM}_d$ is less than or equal to $\binom{m+d}{m}$.



A codeword $c \in \mathrm{RM}_d$ can be indexed by integers as $c = (c_1, \dots, c_n)$ or by points as $c = (c_{\mathbf{P}_1}, \dots, c_{\mathbf{P}_n})$, where $c_i = c_{\mathbf{P}_i} = F(\mathbf{P}_i)$.

Definition 2 (Information set): Let $C$ be an $[n, k]$ linear code over $\mathbb{F}_q$. An information set of $C$ is a subset $I \subseteq \{1, \dots, n\}$ such that the map

\[ \varphi : C \to \mathbb{F}_q^{|I|}, \qquad c \mapsto (c_i)_{i \in I} \tag{1} \]

is a bijection.

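J.D. Key et al. [7] gave information sets for Reed-Muller codes, which we recall in the following Theorem.

Theorem 1 ([7]): An information set of $\mathrm{RM}_d$ is

\[ \Big\{ (\alpha_{i_1}, \dots, \alpha_{i_m}) \ \Big|\ 0 \le i_l \le q-1,\ 1 \le l \le m,\ \sum_{l=1}^{m} i_l \le d \Big\}. \]

We denote this particular information set by $\mathcal{I}_d$, with $\mathcal{I}_d \subseteq V$. Denote by

\[ \mathcal{K}_d = \Big\{ (i_1, \dots, i_m) \ \Big|\ 0 \le i_l \le q-1,\ 1 \le l \le m,\ \sum_{l=1}^{m} i_l = d \Big\}, \]

\[ \mathcal{L}_d = \Big\{ (i_1, \dots, i_m) \ \Big|\ 0 \le i_l \le q-1,\ 1 \le l \le m,\ \sum_{l=1}^{m} i_l \le d \Big\}, \]

then we have $k_d = \dim(\mathrm{RM}_d) = |\mathcal{L}_d| = \sum_{i=0}^{d} |\mathcal{K}_i|$ (see [6]).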
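As a small numerical illustration of the dimension count $k_d$, the following sketch enumerates the exponent tuples with entries at most $q-1$ and sum at most $d$. The function names are hypothetical and chosen for this example only; it only counts exponents and does not build the field $\mathbb{F}_q$.

```python
from itertools import product
from math import comb

def exponent_set(m, q, d):
    """Exponent tuples (i_1..i_m) with 0 <= i_l <= q-1 and i_1+...+i_m <= d."""
    return [e for e in product(range(q), repeat=m) if sum(e) <= d]

def shell(m, q, d):
    """Exponent tuples with 0 <= i_l <= q-1 and i_1+...+i_m == d."""
    return [e for e in product(range(q), repeat=m) if sum(e) == d]

def rm_dimension(m, q, d):
    """k_d, counted as the number of admissible exponent tuples."""
    return len(exponent_set(m, q, d))

# Example with m = 2 variables over a field with q = 3 elements.
dims = [rm_dimension(2, 3, d) for d in range(5)]
print(dims)
```

For $d \le q-1$ the count agrees with $\binom{m+d}{m}$, and it saturates at $q^m$ once $d \ge m(q-1)$.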
B. Multiplicity codes 

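First we recall the notion of Hasse derivative for multivariate polynomials. We write polynomials $F \in \mathbb{F}_q[\mathbf{X}] = \mathbb{F}_q[X_1, \dots, X_m]$ without parentheses and without variables, and $F(\mathbf{X})$ (resp. $F(\mathbf{P})$) when the evaluation on indeterminates (resp. points) has to be specified. Given $\mathbf{i}, \mathbf{j} \in \mathbb{N}^m$, we denote:

• $\mathbf{i} \le \mathbf{j}$ if $i_l \le j_l$ for all $l = 1, \dots, m$,

• $\mathbf{i} < \mathbf{j}$ if $\mathbf{i} \le \mathbf{j}$ and $i_l < j_l$ for some $1 \le l \le m$.

Given $\mathbf{i} \in \mathbb{N}^m$ and $F \in \mathbb{F}_q[\mathbf{X}]$, the $\mathbf{i}$-th Hasse derivative of $F$, denoted by $H(F, \mathbf{i})$, is the coefficient of $\mathbf{Z}^{\mathbf{i}}$ in the polynomial $F(\mathbf{X} + \mathbf{Z}) \in \mathbb{F}_q[\mathbf{X}, \mathbf{Z}]$, where $\mathbf{Z} = (Z_1, \dots, Z_m)$. More specifically, let $F(\mathbf{X}) = \sum_{\mathbf{j} \ge 0} f_{\mathbf{j}} \mathbf{X}^{\mathbf{j}}$, then

\[ F(\mathbf{X} + \mathbf{Z}) = \sum_{\mathbf{j}} f_{\mathbf{j}} (\mathbf{X} + \mathbf{Z})^{\mathbf{j}} = \sum_{\mathbf{i}} H(F, \mathbf{i})(\mathbf{X})\, \mathbf{Z}^{\mathbf{i}}, \]

where

\[ H(F, \mathbf{i})(\mathbf{X}) = \sum_{\mathbf{j} \ge \mathbf{i}} f_{\mathbf{j}} \binom{\mathbf{j}}{\mathbf{i}} \mathbf{X}^{\mathbf{j} - \mathbf{i}}, \]

with

\[ \binom{\mathbf{j}}{\mathbf{i}} = \binom{j_1}{i_1} \cdots \binom{j_m}{i_m}. \]

Given $F, G \in \mathbb{F}_q[\mathbf{X}]$ and $\mathbf{i} \in \mathbb{N}^m$, we have (Leibniz rule [2]):

\[ H(F \cdot G, \mathbf{i}) = \sum_{0 \le \mathbf{k} \le \mathbf{i}} H(F, \mathbf{k}) \cdot H(G, \mathbf{i} - \mathbf{k}). \tag{2} \]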
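The definition of the Hasse derivative and the Leibniz rule (2) can be checked on small examples. The sketch below is an illustration under simplifying assumptions: it works over a prime field $\mathbb{F}_p$ (so $t = 1$), represents a polynomial as a dictionary from exponent tuples to coefficients, and uses hypothetical helper names (`hasse`, `mul`, `add`, `leibniz_rhs`).

```python
from itertools import product
from math import comb

P_MOD = 5  # assumption: the prime field F_5

def hasse(F, i, p=P_MOD):
    """i-th Hasse derivative of F = {exponent tuple: coefficient} over F_p."""
    out = {}
    for j, c in F.items():
        if all(jl >= il for jl, il in zip(j, i)):
            b = 1
            for jl, il in zip(j, i):
                b *= comb(jl, il)          # multi-index binomial (j choose i)
            e = tuple(jl - il for jl, il in zip(j, i))
            out[e] = (out.get(e, 0) + c * b) % p
    return {e: c for e, c in out.items() if c}

def mul(F, G, p=P_MOD):
    """Product of two polynomials over F_p."""
    out = {}
    for a, ca in F.items():
        for b, cb in G.items():
            e = tuple(x + y for x, y in zip(a, b))
            out[e] = (out.get(e, 0) + ca * cb) % p
    return {e: c for e, c in out.items() if c}

def add(F, G, p=P_MOD):
    """Sum of two polynomials over F_p."""
    out = dict(F)
    for e, c in G.items():
        out[e] = (out.get(e, 0) + c) % p
    return {e: c for e, c in out.items() if c}

def leibniz_rhs(F, G, i, p=P_MOD):
    """Right-hand side of (2): sum over 0 <= k <= i of H(F,k) * H(G,i-k)."""
    total = {}
    for k in product(*(range(il + 1) for il in i)):
        rest = tuple(il - kl for il, kl in zip(i, k))
        total = add(total, mul(hasse(F, k, p), hasse(G, rest, p), p), p)
    return total

# Two arbitrary bivariate polynomials over F_5.
F = {(2, 1): 3, (0, 3): 1, (1, 0): 4}
G = {(1, 1): 2, (5, 0): 1}
FG = mul(F, G)
print(all(hasse(FG, i) == leibniz_rhs(F, G, i) for i in [(0, 0), (1, 1), (2, 1)]))
```

Note that the first Hasse derivative of $X^5$ vanishes over $\mathbb{F}_5$, because the binomial coefficient $\binom{5}{1}$ is zero modulo 5.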

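Now, given a derivation order $s > 0$, we introduce an extended notion of evaluation. For a given $s > 0$, there are $\sigma = \binom{m+s-1}{m}$ Hasse derivatives of a polynomial $F$: $H(F, \mathbf{i})$, $\mathbf{i} \in \mathbb{N}^m$, $|\mathbf{i}| < s$. Denote by $S = \{ \mathbf{j} \in \mathbb{N}^m;\ |\mathbf{j}| < s \}$, and let $\Sigma = \mathbb{F}_q^{\sigma}$. An element $x \in \Sigma$ is written as

\[ x = (x_{\mathbf{j}})_{\mathbf{j} \in S}, \qquad x_{\mathbf{j}} \in \mathbb{F}_q. \]

We generalize the evaluation map at a point $\mathbf{P}$:

\[ \mathrm{ev}^s_{\mathbf{P}} : \mathbb{F}_q[\mathbf{X}] \to \Sigma, \qquad F \mapsto \big( H(F, \mathbf{v})(\mathbf{P}) \big)_{\mathbf{v} \in S} \]

and the total evaluation rule is

\[ \mathrm{ev}^s : \mathbb{F}_q[\mathbf{X}] \to \Sigma^n, \qquad F \mapsto \big( \mathrm{ev}^s_{\mathbf{P}_1}(F), \dots, \mathrm{ev}^s_{\mathbf{P}_n}(F) \big). \]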

Definition 3 (Multiplicity Codes [3]): Given the above evaluation map and a degree $d < sq$, the corresponding Multiplicity code is

\[ \mathrm{Mult}^s_d = \{ \mathrm{ev}^s(F) \mid F \in \mathbb{F}_q[\mathbf{X}]_d \}. \]

In the context of [3], the constraint $d < sq$ is required to ensure that $\mathrm{ev}^s$ restricted to $\mathbb{F}_q[\mathbf{X}]_d$ is injective.
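To make the parameters of Definition 3 concrete, here is a short sketch that enumerates the set $S$ of derivation orders and computes $\sigma$, $n$ and the message length $k = \binom{m+d}{m}$. The function names and the returned dictionary are hypothetical; the rate is computed as $k / (\sigma n)$, i.e. counting both message and codeword in $\mathbb{F}_q$-symbols.

```python
from itertools import product
from math import comb

def derivative_orders(m, s):
    """The set S of multi-indices j in N^m with |j| < s."""
    return [j for j in product(range(s), repeat=m) if sum(j) < s]

def multiplicity_code_parameters(m, q, s, d):
    """Basic parameters of the multiplicity code of degree d and order s."""
    if not d < s * q:
        raise ValueError("Definition 3 requires d < s*q")
    sigma = comb(m + s - 1, m)   # number of Hasse derivatives per point
    n = q ** m                   # number of evaluation points
    k = comb(m + d, m)           # dimension of polynomials of degree <= d
    return {"sigma": sigma, "n": n, "k": k, "rate": k / (sigma * n)}

params = multiplicity_code_parameters(m=1, q=5, s=2, d=7)
print(params)
```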


C. Information sets of Multiplicity codes 

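The difficulty in properly defining an information set for a multiplicity code is that the $\mathbb{F}_q$-symbols of the message space are not the same as the $\mathbb{F}_q^{\sigma}$-symbols of the codeword space. Recall that a codeword $c \in \Sigma^n$ can be indexed by points $\mathbf{P} \in V$:

\[ c = (c_{\mathbf{P}})_{\mathbf{P} \in V}, \qquad c_{\mathbf{P}} \in \Sigma. \]

Each $c_{\mathbf{P}}$ can be written $c_{\mathbf{P}} = (c_{\mathbf{j},\mathbf{P}})_{\mathbf{j} \in S}$, hence we can write $c = (c_{\mathbf{j},\mathbf{P}})_{\mathbf{j} \in S,\, \mathbf{P} \in V}$.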

We can now define information sets of Multiplicity Codes: 

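Definition 4 (Information set of a Multiplicity Code): An information set of $\mathrm{Mult}^s_d$ is a set $\mathcal{I} \subseteq S \times \mathbb{F}_q^m$ such that the mapping

\[ \phi : \mathrm{Mult}^s_d \to \mathbb{F}_q^{|\mathcal{I}|}, \qquad c \mapsto (c_{\mathbf{j},\mathbf{P}})_{(\mathbf{j},\mathbf{P}) \in \mathcal{I}} \]

is bijective.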

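In [5], an information set $\mathcal{I}$ of $\mathrm{Mult}^s_d$ based on information sets of Reed-Muller codes was suggested, namely, $\mathcal{I} = (\mathbf{j}, \mathcal{I}_{d_{\mathbf{j}}})_{\mathbf{j} \in S}$, where $\mathcal{I}_{d_{\mathbf{j}}}$ is the information set of the $d_{\mathbf{j}}$-th order Reed-Muller code as in [7], and where the degree $d_{\mathbf{j}}$ is

\[ d_{\mathbf{j}} = \min\big( m(q-1),\ d - |\mathbf{j}| q \big), \qquad \mathbf{j} \in S. \]

We prove that $\mathcal{I}$ is an information set in the next two Sections.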
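The suggested set can be built explicitly for small parameters. The following sketch is only a numerical sanity check, not a proof: it lists the pairs $(\mathbf{j}, \mathbf{e})$, where the exponent tuple $\mathbf{e}$ stands for the point $(\alpha_{e_1}, \dots, \alpha_{e_m})$, and compares the size of the set with $\binom{m+d}{m}$. The function name `info_set` is hypothetical.

```python
from itertools import product
from math import comb

def info_set(m, q, s, d):
    """Pairs (j, e) with j in S and e indexing a point of the information
    set of the Reed-Muller code of order d_j = min(m(q-1), d - |j| q)."""
    pairs = []
    for j in product(range(s), repeat=m):
        if sum(j) >= s:
            continue                      # j is not in S
        dj = min(m * (q - 1), d - sum(j) * q)
        for e in product(range(q), repeat=m):
            if sum(e) <= dj:              # empty when dj < 0
                pairs.append((j, e))
    return pairs

# Sizes for a few small parameter sets (m, q, s, d) with d < s*q.
for (m, q, s, d) in [(1, 5, 2, 7), (2, 3, 2, 5), (2, 4, 3, 10)]:
    print((m, q, s, d), len(info_set(m, q, s, d)), comb(m + d, m))
```

In these examples the size of the set equals the dimension $\binom{m+d}{m}$ of $\mathbb{F}_q[\mathbf{X}]_d$, which is what the general argument has to establish.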


III. Systematic encoding algorithm 
A. A polynomial decomposition 

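Given a multi-index $\mathbf{j} = (j_1, \dots, j_m)$, let $V_{\mathbf{j}}$ be the polynomial

\[ V_{\mathbf{j}} = \prod_{i=1}^{m} (X_i^q - X_i)^{j_i}. \]

The following decomposition is given in [5] without proof.

Lemma 1: Let $F \in \mathbb{F}_q[\mathbf{X}]$ have total degree less than or equal to $d$, then $F$ can be written as

\[ F = \sum_{|\mathbf{j}| \le d/q} F_{\mathbf{j}} V_{\mathbf{j}} \tag{3} \]

for some polynomials $F_{\mathbf{j}} \in \mathbb{F}_q[\mathbf{X}]_{d_{\mathbf{j}}}$. There also exists a polynomial $F_{\mathbf{j}_0}$ where $|\mathbf{j}_0| = \lfloor d/q \rfloor$ and $\deg(F_{\mathbf{j}_0}) = d - \lfloor d/q \rfloor q$.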
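Proof: We consider a multivariate monomial $X_1^{u_1} \cdots X_m^{u_m}$ and write $u_i = t_i q + r_i$ with $0 \le r_i < q$, for all $i = 1, \dots, m$. First, we consider just $X_1^{u_1}$:

• if $t_1 = 0$, since $r_1 < q$, we do not need to prove anything;

• if $t_1 > 0$, we have:

\[ X_1^{u_1} = X_1^{r_1} \cdot \big( (X_1^q - X_1) + X_1 \big)^{t_1} = \sum_{i=0}^{t_1} \binom{t_1}{i} X_1^{r_1 + t_1 - i} (X_1^q - X_1)^{i}. \]

Similarly, we recursively apply the above reduction to $X_1^{r_1 + t_1 - i}$, where $i = 0, \dots, t_1$, so we finally obtain:

\[ X_1^{u_1} = \sum_{i=0}^{t_1'} P_{1,i}(X_1) \cdot (X_1^q - X_1)^{i} = P_{1,t_1'}(X_1) \cdot (X_1^q - X_1)^{t_1'} + \sum_{i=0}^{t_1'-1} P_{1,i}(X_1) \cdot (X_1^q - X_1)^{i}, \]

where $\deg(P_{1,i}) \le q - 1$ for $i = 0, \dots, t_1'$, for some $t_1'$. We see that

\[ \deg\big( P_{1,i}(X_1) \cdot (X_1^q - X_1)^{i} \big) < q(i+1) \le q t_1' \le u_1, \]

for all $i = 0, \dots, t_1' - 1$, so the term of degree $u_1 = \deg(X_1^{u_1})$ belongs to $P_{1,t_1'}(X_1) \cdot (X_1^q - X_1)^{t_1'}$, hence $\deg(P_{1,t_1'}) = u_1 - q t_1' = r_1 + q(t_1 - t_1')$.

Since $0 \le r_1 < q$ and $0 \le \deg(P_{1,t_1'}) \le q - 1$, it follows that $t_1 - t_1' = 0$, so we have $\deg(P_{1,t_1}) \le \min(q - 1, u_1 - q t_1)$. Doing the same thing with the other variables $X_2, \dots, X_m$, we obtain:

\[ X_1^{u_1} \cdots X_m^{u_m} = \Big( \sum_{i_1=0}^{t_1} P_{1,i_1}(X_1) \cdot (X_1^q - X_1)^{i_1} \Big) \cdots \Big( \sum_{i_m=0}^{t_m} P_{m,i_m}(X_m) \cdot (X_m^q - X_m)^{i_m} \Big) \]

\[ = \sum_{\mathbf{i} \le (t_1, \dots, t_m)} P_{\mathbf{i}}(X_1, X_2, \dots, X_m) \cdot V_{\mathbf{i}}(X_1, X_2, \dots, X_m), \]

where $P_{\mathbf{i}}(\mathbf{X}) = P_{1,i_1}(X_1) \cdots P_{m,i_m}(X_m)$ and

\[ \deg(P_{\mathbf{i}}) = \sum_{j=1}^{m} \deg(P_{j,i_j}) \le \min\Big( m(q-1),\ \sum_{j=1}^{m} (u_j - q i_j) \Big) = \min\Big( m(q-1),\ \Big( \sum_{j=1}^{m} u_j \Big) - |\mathbf{i}| q \Big). \]

Since a multivariate polynomial is the sum of multivariate monomials, we obtain the result. We also note that if there did not exist an $F_{\mathbf{j}_0}$ such that $|\mathbf{j}_0| = \lfloor d/q \rfloor$ and $\deg(F_{\mathbf{j}_0}) = d - \lfloor d/q \rfloor q$, then the degree of the RHS of (3) would not be equal to $\deg(F)$. ■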

We prove the uniqueness of the $F_{\mathbf{j}}$'s in (3) in the next Section.
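In the univariate case $m = 1$, the decomposition of Lemma 1 can be obtained by repeated division by $X^q - X$. The sketch below illustrates this under simplifying assumptions: $q = p$ is prime, a polynomial is a list of coefficients from low to high degree, and all helper names are hypothetical. It is not the reduction used in the proof, only an equivalent way to compute the $F_j$'s for one variable.

```python
def trim(A):
    """Drop trailing zero coefficients."""
    A = list(A)
    while A and A[-1] == 0:
        A.pop()
    return A

def pmul(A, B, p):
    """Product of two coefficient lists modulo p."""
    out = [0] * (len(A) + len(B) - 1) if A and B else []
    for i, a in enumerate(A):
        for j, b in enumerate(B):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def padd(A, B, p):
    """Sum of two coefficient lists modulo p."""
    n = max(len(A), len(B))
    return [((A[i] if i < len(A) else 0) + (B[i] if i < len(B) else 0)) % p
            for i in range(n)]

def divmod_xq_minus_x(F, q, p):
    """Return (Q, R) with F = Q*(X^q - X) + R and deg R < q."""
    R = [c % p for c in F]
    Q = [0] * max(len(R) - q, 0)
    for k in range(len(R) - 1, q - 1, -1):
        c = R[k]
        if c:
            Q[k - q] = c
            R[k] = 0
            # c*X^k = c*X^(k-q)*(X^q - X) + c*X^(k-q+1)
            R[k - q + 1] = (R[k - q + 1] + c) % p
    return Q, R[:q]

def decompose(F, q, p):
    """List [F_0, F_1, ...] with F = sum_j F_j * (X^q - X)^j, deg F_j < q."""
    parts = []
    while any(F):
        F, R = divmod_xq_minus_x(F, q, p)
        parts.append(R)
    return parts

def recompose(parts, q, p):
    """Rebuild sum_j F_j * (X^q - X)^j from the list of F_j."""
    v = [0, p - 1] + [0] * (q - 2) + [1]   # X^q - X
    total, power = [], [1]
    for Fj in parts:
        total = padd(total, pmul(Fj, power, p), p)
        power = pmul(power, v, p)
    return total

p = q = 3
F = [1, 1, 0, 0, 2, 0, 0, 1]        # 1 + X + 2X^4 + X^7, so d = 7
parts = decompose(F, q, p)
print(parts)
```

Here $d = 7$ and $\lfloor d/q \rfloor = 2$; the last component $F_2 = X$ has degree $1 = d - \lfloor d/q \rfloor q$, as stated in Lemma 1.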

B. Corresponding systematic encoding 

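Considering a point $\mathbf{P} \in V$, we have $V_{\mathbf{j}}(\mathbf{P} + \mathbf{Z}) = \sum_{\mathbf{i}} H(V_{\mathbf{j}}, \mathbf{i})(\mathbf{P})\, \mathbf{Z}^{\mathbf{i}}$, and, since $(p_i + Z_i)^q = p_i^q + Z_i^q$ and $p_i^q = p_i$ for every coordinate $p_i \in \mathbb{F}_q$,

\[ V_{\mathbf{j}}(\mathbf{P} + \mathbf{Z}) = \prod_{i=1}^{m} \big( (p_i + Z_i)^q - (p_i + Z_i) \big)^{j_i} = \prod_{i=1}^{m} (Z_i^q - Z_i)^{j_i} = \mathbf{Z}^{\mathbf{j}} \prod_{i=1}^{m} (Z_i^{q-1} - 1)^{j_i}. \]

So, we have proved the following [5]:

\[ H(V_{\mathbf{j}}, \mathbf{i})(\mathbf{P}) = \begin{cases} 0 & \text{if } \mathbf{i} \not\ge \mathbf{j}, \\ (-1)^{|\mathbf{j}|} & \text{if } \mathbf{i} = \mathbf{j}. \end{cases} \tag{4} \]

When we compute the Hasse derivative of $F$, we find

\[ H(F, \mathbf{i}) = \sum_{|\mathbf{j}| \le d/q} H(F_{\mathbf{j}} V_{\mathbf{j}}, \mathbf{i}) = \sum_{|\mathbf{j}| \le d/q}\ \sum_{\mathbf{u} + \mathbf{v} = \mathbf{i}} H(F_{\mathbf{j}}, \mathbf{u})\, H(V_{\mathbf{j}}, \mathbf{v}), \]

\[ H(F, \mathbf{i})(\mathbf{P}) = \sum_{|\mathbf{j}| \le d/q}\ \sum_{\mathbf{u} + \mathbf{v} = \mathbf{i},\ \mathbf{v} \ge \mathbf{j}} H(F_{\mathbf{j}}, \mathbf{u})(\mathbf{P})\, H(V_{\mathbf{j}}, \mathbf{v})(\mathbf{P}). \]

Thanks to (4), the summation reduces to

\[ H(F, \mathbf{i})(\mathbf{P}) = \sum_{\mathbf{j} \le \mathbf{i}}\ \sum_{\mathbf{u} + \mathbf{v} = \mathbf{i},\ \mathbf{v} \ge \mathbf{j}} H(F_{\mathbf{j}}, \mathbf{u})(\mathbf{P})\, H(V_{\mathbf{j}}, \mathbf{v})(\mathbf{P}) \]

\[ = (-1)^{|\mathbf{i}|} F_{\mathbf{i}}(\mathbf{P}) + \sum_{\mathbf{j} < \mathbf{i}}\ \sum_{\mathbf{u} + \mathbf{v} = \mathbf{i},\ \mathbf{v} \ge \mathbf{j}} H(F_{\mathbf{j}}, \mathbf{u})(\mathbf{P})\, H(V_{\mathbf{j}}, \mathbf{v})(\mathbf{P}). \tag{5} \]

Thus we can find the evaluation of $F_{\mathbf{i}}$ at $\mathbf{P} \in \mathbb{F}_q^m$ if we know:

• $H(F, \mathbf{i})(\mathbf{P})$;

• the polynomials $F_{\mathbf{j}}$ for every $\mathbf{j} < \mathbf{i}$.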
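The key property (4), on which the triangular system (5) rests, can be verified numerically in the univariate case. The sketch below assumes $m = 1$ and $q = p$ prime, represents polynomials as coefficient lists from low to high degree, and uses hypothetical helper names.

```python
from math import comb

def hasse_eval(F, i, a, p):
    """Value at a of the i-th Hasse derivative of F (coefficients low to high), mod p."""
    return sum(c * comb(n, i) * pow(a, n - i, p)
               for n, c in enumerate(F) if n >= i) % p

def V(j, q, p):
    """Coefficient list of (X^q - X)^j, expanded with the binomial theorem."""
    coeffs = [0] * (q * j + 1)
    for t in range(j + 1):
        # term comb(j,t) * (X^q)^t * (-X)^(j-t)
        coeffs[q * t + j - t] = comb(j, t) * (-1) ** (j - t) % p
    return coeffs

p = q = 5
# H(V_j, i)(a) for every point a, for i < j (expected 0) and i = j (expected (-1)^j).
below = [hasse_eval(V(j, q, p), i, a, p)
         for j in range(4) for i in range(j) for a in range(p)]
diagonal = [(j, hasse_eval(V(j, q, p), j, a, p)) for j in range(4) for a in range(p)]
print(set(below), sorted(set(diagonal)))
```

Because the derivatives of order below $j$ vanish at every point and the $j$-th one is the constant $(-1)^j$, the value $F_i(P)$ can be isolated in (5) once the $F_j$ with $j < i$ are known.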

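Now, using the information set $\mathcal{I}_{d_{\mathbf{j}}}$ of the Reed-Muller code $\mathrm{RM}_{d_{\mathbf{j}}}$ given by Theorem 1, we can determine $F_{\mathbf{j}}$ given the values $F_{\mathbf{j}}(\mathbf{P})$, $\mathbf{P} \in \mathcal{I}_{d_{\mathbf{j}}}$. So the set $\mathcal{I}$:

\[ \mathcal{I} = (\mathbf{j}, \mathcal{I}_{d_{\mathbf{j}}})_{\mathbf{j} \in S} \tag{6} \]

enables us to find $F_{\mathbf{j}}$ from its values on $\mathcal{I}_{d_{\mathbf{j}}}$. Under uniqueness of (3), we have the following:

Proposition 1: An information set of $\mathrm{Mult}^s_d$ is given by (6).

Given a message $M$ of length $k = \binom{m+d}{m}$ over $\mathbb{F}_q$, we consider the polynomial $F \in \mathbb{F}_q[\mathbf{X}]_d$ whose list of coefficients is given by $M$. Then, the classical non-systematic encoding of $M$ is $\mathrm{ev}^s(F) \in \Sigma^n$.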




For the systematic encoding, we write the message as $M = (M_{\mathbf{j},\mathbf{P}})$, where $\mathbf{P} \in \mathcal{I}_{d_{\mathbf{j}}}$ and $|\mathbf{j}| \le d/q$, and we define $F$ to be the unique polynomial such that $H(F, \mathbf{j})(\mathbf{P}) = M_{\mathbf{j},\mathbf{P}}$. We then construct $F$ according to the above discussion: from the values $H(F, \mathbf{j})(\mathbf{P})$, we find $F_{\mathbf{j}}$ thanks to (5). Then we find $F$ using (3) and finally we evaluate $F$ on the remaining $(\mathbf{j}, \mathbf{P}) \notin \mathcal{I}$. The systematic encoding of $M$ over $V$ is $\mathrm{ev}^s(F)$. We summarize this systematic encoding in Algorithm 1.


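Algorithm 1 Systematic encoding algorithm for multiplicity codes

Input: The message $M = (M_{\mathbf{j},\mathbf{P}})_{(\mathbf{j},\mathbf{P}) \in \mathcal{I}}$ of dimension $k$.
Output: The systematic encoding of $M$ over $V$.

1: Determine recursively the polynomials $F_{\mathbf{j}} \in \mathbb{F}_q[\mathbf{X}]_{d_{\mathbf{j}}}$ with $|\mathbf{j}| \le d/q$, using (5), where $H(F, \mathbf{i})(\mathbf{P})$ is given by $H(F, \mathbf{i})(\mathbf{P}) = M_{\mathbf{i},\mathbf{P}}$, $\mathbf{i} \in S$.

2: Compute the polynomial $F \in \mathbb{F}_q[\mathbf{X}]$ as $F = \sum_{|\mathbf{j}| \le d/q} F_{\mathbf{j}} V_{\mathbf{j}}$.

3: return $\mathrm{ev}^s(F)$, the systematic encoding of $M$ over $V$.


IV. Uniqueness of the decomposition

To have uniqueness of $F$ constructed from the message $(M_{\mathbf{j},\mathbf{P}})_{(\mathbf{j},\mathbf{P}) \in \mathcal{I}}$, and full correctness of Algorithm 1, the following statement suffices.

Lemma 2: The decomposition (3) in Lemma 1 is unique.

Proof: We prove this lemma by showing that the size of $\mathcal{I}$ defined by (6) is exactly the dimension $k$ of the code. Assume that $d = rq + t$, hence $r \le s - 1$ and $t < q$ (since $d < sq$). Recall that the dimension of Reed-Muller codes satisfies $k_d = |\mathcal{I}_d| = |\mathcal{L}_d|$. There are some particular cases:

• When $d \ge m(q-1)$, $k_d = q^m$;

• When $0 \le d \le q - 1$, $k_d = \binom{m+d}{m}$;

• When $d < 0$, $k_d = 0$.

Since we do not know any closed formula for $k_d$, we use generating functions (see [8], [9]). First, we give a brief introduction. If $f(x) = \sum_{n \ge 0} a_n x^n$, then we call $a_n$ the coefficient of $x^n$, and denote it by $a_n = [x^n] f(x)$. Recall that:

\[ \frac{1}{(1-x)^k} = \sum_{n \ge 0} \binom{n+k-1}{k-1} x^n. \tag{7} \]

Using

\[ \mathcal{K}_d = \Big\{ (i_1, \dots, i_m) \ \Big|\ 0 \le i_l \le q-1;\ 1 \le l \le m;\ \sum_{l=1}^{m} i_l = d \Big\}, \]

we have a one-to-one mapping between elements $(i_1, \dots, i_m) \in \mathcal{K}_d$ and monomials $x_1^{i_1} x_2^{i_2} \cdots x_m^{i_m}$ of total degree $d$ and individual degree not greater than $q - 1$. Hence, for a degree $d$, consider the generating function:

\[ f_m(x) = \underbrace{(1 + x + \cdots + x^{q-1}) \cdots (1 + x + \cdots + x^{q-1})}_{m \text{ times}} = \Big( \frac{1 - x^q}{1 - x} \Big)^{m}, \]

then the coefficient of $x^d$ in $f_m(x)$ is exactly the cardinality of $\mathcal{K}_d$, with the convention that $\mathcal{K}_d = \emptyset$ when $d > m(q-1)$. From this, we use that $k_d = |\mathcal{L}_d| = |\mathcal{K}_0| + \cdots + |\mathcal{K}_d|$, with $|\mathcal{K}_d| = [x^d] f_m(x)$, $|\mathcal{K}_{d-1}| = [x^{d-1}] f_m(x) = [x^d](x f_m(x))$, ..., $|\mathcal{K}_0| = [x^0] f_m(x) = [x^d](x^d f_m(x))$. Therefore:

\[ k_d = [x^d] \big( f_m(x) + x f_m(x) + x^2 f_m(x) + \cdots + x^d f_m(x) \big) = [x^d]\, \frac{1 - x^{d+1}}{1 - x}\, f_m(x) = [x^d]\, \frac{f_m(x)}{1 - x}. \]

Note that $k_d = |\mathcal{I}_d|$. Similarly as above, we have:

\[ k_d = [x^d]\, \frac{f_m(x)}{1-x}, \quad k_{d-q} = [x^d]\, \frac{x^q f_m(x)}{1-x}, \quad \dots, \quad k_{d-rq} = [x^d]\, \frac{x^{rq} f_m(x)}{1-x}, \]

where $d = rq + t$ and $t < q$. For every $0 \le i \le r$, there are $\binom{m-1+i}{m-1}$ multi-indices $\mathbf{j}$ with $|\mathbf{j}| = i$, hence as many sets $(\mathbf{j}, \mathcal{I}_{d_{\mathbf{j}}})$, each of size $k_{d-iq}$. By (6), it follows that the size of $\mathcal{I}$ is thus

\[ |\mathcal{I}| = \sum_{i=0}^{r} \binom{m-1+i}{m-1} k_{d-iq}, \tag{8} \]

which implies

\[ |\mathcal{I}| = [x^d] \Big( \binom{m-1}{m-1} \frac{f_m(x)}{1-x} + \binom{m}{m-1} \frac{x^q f_m(x)}{1-x} + \cdots + \binom{m-1+r}{m-1} \frac{x^{rq} f_m(x)}{1-x} \Big) = [x^d] \Big( \sum_{i=0}^{r} \binom{m-1+i}{m-1} x^{iq} \Big) \frac{f_m(x)}{1-x}. \]

Using (7), we have

\[ \sum_{i \ge 0} \binom{m-1+i}{m-1} x^{iq} = \frac{1}{(1 - x^q)^m}, \]

so, since the terms with $i > r$ do not contribute to the coefficient of $x^d$,

\[ |\mathcal{I}| = [x^d] \Big( \sum_{i \ge 0} \binom{m-1+i}{m-1} x^{iq} \Big) \frac{f_m(x)}{1-x} = [x^d]\, \frac{1}{(1 - x^q)^m (1-x)}\, f_m(x) \]

\[ = [x^d]\, \frac{1}{(1 - x^q)^m (1-x)} \Big( \frac{1 - x^q}{1 - x} \Big)^{m} = [x^d]\, \frac{1}{(1-x)^{m+1}} = \binom{m+d}{m} = k, \]

as we wanted to prove. To conclude the proof, we consider the map

\[ \psi : (F_{\mathbf{j}})_{|\mathbf{j}| \le d/q} \mapsto \sum_{|\mathbf{j}| \le d/q} F_{\mathbf{j}} V_{\mathbf{j}}, \]

defined on the families of polynomials $F_{\mathbf{j}}$ produced in Lemma 1 (each of total degree at most $d_{\mathbf{j}}$ and of degree at most $q - 1$ in every variable, hence determined by its values on $\mathcal{I}_{d_{\mathbf{j}}}$), with values in $\mathbb{F}_q[\mathbf{X}]_d$. Lemma 1 shows that $\psi$ is surjective. Since we have just proved that $|\mathcal{I}| = k$, the equality of dimensions of the range and of the domain implies that $\psi$ is bijective, in particular one-to-one. ■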

Note that from Equation (8), we can easily compute the value of $k_d$ recursively from the $k_{d-iq}$'s where $0 < i \le d/q$.
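The recursion suggested by Equation (8) can be sketched as follows: isolating the $i = 0$ term gives $k_d = \binom{m+d}{m} - \sum_{1 \le i \le d/q} \binom{m-1+i}{m-1} k_{d-iq}$. The code below is an illustrative implementation with hypothetical function names, checked against a brute-force count of exponent tuples; it relies on the generating-function identity holding for every $d \ge 0$.

```python
from functools import lru_cache
from itertools import product
from math import comb

def rm_dim(m, q, d):
    """k_d computed recursively from the k_{d - i q}, 0 < i <= d/q."""
    @lru_cache(maxsize=None)
    def k(e):
        if e < 0:
            return 0
        return comb(m + e, m) - sum(comb(m - 1 + i, m - 1) * k(e - i * q)
                                    for i in range(1, e // q + 1))
    return k(d)

def rm_dim_bruteforce(m, q, d):
    """k_d counted as the number of exponent tuples with entries < q and sum <= d."""
    return sum(1 for e in product(range(q), repeat=m) if sum(e) <= d)

checks = [rm_dim(m, q, d) == rm_dim_bruteforce(m, q, d)
          for m in (1, 2, 3) for q in (2, 3, 4) for d in range(0, 3 * q)]
print(all(checks))
```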

V. Systematic encoding for Derivative Codes 

In this Section, we apply the previous results to the particular case of $m = 1$. This boils down to codes generalizing Reed-Solomon codes, using derivatives. These codes have been used in [10], where they were given the name of Derivative Codes. Let $s$ and $d$ be given as in Definition 3. In this case, the information sets $\mathcal{I}_{d_j}$ are

\[ \mathcal{I}_{d_j} = \{ \alpha_i \mid 0 \le i \le d_j \}, \qquad j = 0, \dots, s - 1. \]

The systematic encoding is described in Algorithm 2.


Algorithm 2 Systematic encoding algorithm for Derivative codes

Input: The message $M = (M_{i,P})$ of dimension $k$, where $P \in \mathcal{I}_{d_i}$ and $i < s$.

Output: The systematic encoding of $M$ over $\mathbb{F}_q$.

1: Find the polynomials $F_i \in \mathbb{F}_q[X]$ where $i < s$, such that:

\[ M_{i,P} = (-1)^{i} F_i(P) + \sum_{j=0}^{i-1} \sum_{v=j}^{i} H(F_j, i - v)(P)\, H(V_j, v)(P) \]

2: Define the polynomial $F \in \mathbb{F}_q[X]$ as

\[ F = \sum_{j < s} F_j V_j, \]

where $V_j = (X^q - X)^j$.

3: return $\mathrm{ev}^s(F)$, the systematic encoding of $M$ over $\mathbb{F}_q$.


VI. Conclusion 

We have defined the notion of information set for Multiplicity codes as $\mathbb{F}_q$-linear codes. We filled in details of the work of Kopparty [5], who introduced a systematic encoding for such codes. Our work also allowed us to propose a new recursive formula for the dimension of Reed-Muller codes over $\mathbb{F}_q$, obtained by a combinatorial argument based on generating functions. Designing efficient algorithms for fast systematic encoding will be the topic of future work.


VII. Acknowledgment 

The third author would like to thank Doron Zeilberger and 
Louis Joseph Billera for the suggestion of using generating 
functions in Section IV. 

References 

[1] J. Katz and L. Trevisan, “On the Efficiency of Local Decoding Procedures for Error-correcting Codes,” in Proceedings of the Thirty-second Annual ACM Symposium on Theory of Computing, STOC ’00, F. Yao and E. Luks, Eds. ACM, 2000, pp. 80-86.

[2] S. Yekhanin, Locally Decodable Codes, ser. Foundations and Trends in 
Theoretical Computer Science. NOW publisher, 2012, vol. 6. 

[3] S. Kopparty, S. Saraf, and S. Yekhanin, “High-rate Codes with Sublinear-time decoding,” in Proceedings of the Forty-third Annual ACM Symposium on Theory of Computing, STOC ’11, S. Vadhan, Ed. New York, USA: ACM, 2011, pp. 167-176.

[4] D. Augot, F. Levy-dit-Vehel, and A. Shikfa, “A storage-efficient and robust private information retrieval scheme allowing few servers,” in Cryptology and Network Security - 13th International Conference, CANS 2014, Heraklion, Crete, ser. Lecture Notes in Computer Science. Springer, 2014, pp. 222-239.

[5] S. Kopparty, “List-decoding multiplicity codes,” Electronic Colloquium 
on Computational Complexity (ECCC), vol. TR12-044, 2012. 

[6] T. Kasami, S. Lin, and W. Peterson, “New generalizations of the Reed- 
Muller codes. I. Primitive codes,” IEEE Trans. Information Theory, 
vol. 14, no. 2, pp. 189-199, 1968. 

[7] J. Key, T. McDonough, and V. Mavron, “Information sets and partial 
permutation decoding for codes from finite geometries,” Finite Fields 
and Their Applications, vol. 12, no. 2, pp. 232-247, Apr. 2006. 

[8] R. P. Stanley, Enumerative Combinatorics, ser. Cambridge Studies in 
Advanced Mathematics. Cambridge University Press, 2011, vol. 1. 

[9] H. S. Wilf, Generatingfunctionology. A. K. Peters, Ltd. Natick, 2006. 

[10] V. Guruswami and C. Wang, “Linear-algebraic list decoding for variants of Reed-Solomon codes,” Information Theory, IEEE Transactions on, vol. 59, no. 6, pp. 3257-3268, Jun. 2013.

Appendix 

Complexity estimates 

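We give a rough and conservative estimate on the number of arithmetic operations in $\mathbb{F}_q$ needed for systematic encoding. Algorithm 1 finds a unique polynomial $F \in \mathbb{F}_q[\mathbf{X}]$ from the $F_{\mathbf{j}}$'s, those $F_{\mathbf{j}}$'s being found from the $\mathbb{F}_q$-symbols $M_{\mathbf{j},\mathbf{P}}$ at the $(\mathbf{j}, \mathbf{P}) \in \mathcal{I} = (\mathbf{j}, \mathcal{I}_{d_{\mathbf{j}}})_{\mathbf{j} \in S}$; then it evaluates back this polynomial $F$ for $(\mathbf{j}, \mathbf{P}) \notin \mathcal{I}$. But (3) requires expensive multiplications of multivariate polynomials. Yet (5) also makes it possible to bypass the computation of $F$, working only with the $F_{\mathbf{j}}$'s, as follows. At step $\mathbf{i}$, a first pass consists in going through the points $\mathbf{P} \in \mathcal{I}_{d_{\mathbf{i}}}$ to compute $F_{\mathbf{i}}(\mathbf{P})$. Then $F_{\mathbf{i}} \in \mathbb{F}_q[\mathbf{X}]_{d_{\mathbf{i}}}$ is uniquely determined by its values on the information set $\mathcal{I}_{d_{\mathbf{i}}}$. Note that $F_{\mathbf{i}}$ can be computed by applying the (precomputed) inverse of $\varphi$ defined in (1), i.e. a matrix-vector product of cost $O(k_{d_{\mathbf{i}}}^2)$. Once $F_{\mathbf{i}}$ is computed, using (5) again, the values $H(F, \mathbf{i})(\mathbf{P})$, for $\mathbf{P} \notin \mathcal{I}_{d_{\mathbf{i}}}$, are computed. With $\sigma = |S|$, we have, for each $\mathbf{i} \in S$: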

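1) for each $\mathbf{P} \in \mathcal{I}_{d_{\mathbf{i}}}$, $O(\sigma^2)$ for computing $F_{\mathbf{i}}(\mathbf{P})$ using (5); thus a total of $k_{d_{\mathbf{i}}} \sigma^2$ for all $\mathbf{P} \in \mathcal{I}_{d_{\mathbf{i}}}$;

2) $O(k_{d_{\mathbf{i}}}^2)$ for recovering $F_{\mathbf{i}}$, using a matrix-vector product;

3) $O(\sigma k_{d_{\mathbf{i}}})$ for computing the $\sigma$ Hasse derivatives of $F_{\mathbf{i}}$ (termwise on $F_{\mathbf{i}}$, step-by-step through $S$);

4) $O(n)$ for at once evaluating $F_{\mathbf{i}}$ on all $\mathbf{P} \notin \mathcal{I}_{d_{\mathbf{i}}}$, neglecting logarithmic factors (multidimensional FFT);

5) for each $\mathbf{P} \notin \mathcal{I}_{d_{\mathbf{i}}}$, $O(\sigma^2)$ for computing each $H(F, \mathbf{i})(\mathbf{P})$ using (5) again, for a total of $(n - k_{d_{\mathbf{i}}}) \sigma^2$.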





Summing over the $\mathbf{i} \in S$, we get a “soft-O” estimate of $O\big( \sum_{\mathbf{i} \in S} (n \sigma^2 + k_{d_{\mathbf{i}}}^2) \big) = O(n \sigma^3 + k^2)$, with a memory footprint of $O(\sigma n)$ for storing all the $F_{\mathbf{j}}$'s and their Hasse derivatives. Note that $\sigma n$ is the size of the output codeword.


